Student Team: YES
Approximately how many hours were spent working on this submission in total?
240 hours
May we post your submission in the Visual Analytics Benchmark Repository
after VAST Challenge 2014 is complete?
YES
Video: TUE-ELZEN-MC2.wmv
Questions
MC2.1 – Describe common daily routines for GAStech employees. What does a day
in the life of a typical GAStech employee look like?
Please limit your response to no more than five images and 300 words.
Based on the GPS data combined with the credit card transaction data, a typical
GAStech employee arrives at the GAStech office at around 8-9 AM each workday.
At around 12 PM they go out for lunch; some prefer the same lunch location
each day, while others vary their choice. After about an hour they return to
work and stay there until they go home at around 5-6 PM. Many then go out for
dinner or drinks and return home later in the evening.
This pattern seems to repeat over the course of the workweek for many of the
regular employees (see Figures 1 and 2).
Figure 1: Typical representational workday of GAStech employee Calixto, Nils.
Figure 2: Small multiple detail visualization of typical workday of GAStech
employee Calixto, Nils.
Truck drivers have a notably different daily routine (see Figure 3). They
travel to various locations around town, including GAStech. The truck drivers
and the remaining GAStech employees also appear to have very little contact
outside of GAStech. This is supported by the lack of matches between the two
groups in the credit card transaction matrix (see Section MC2.2).
Figure 3: Truck drivers' routes compared to non-truck drivers' routes.
MC2.2 – Identify up to twelve unusual events or patterns that you see in the data.
If you identify more than twelve patterns during your analysis, focus your
answer on the patterns you consider to be most important for further
investigation to help find the missing staff members. For each pattern or event
you identify, describe
a. What is the pattern or event you observe?
b. Who is involved?
c. What locations are involved?
d. When does the pattern or event take place?
e. Why is this pattern or event significant?
f. What is your level of confidence about this pattern or event? Why?
Please limit your answer to no more than twelve images and 1500 words.
Several patterns were discovered by matching credit card transactions between
pairs of GAStech employees and visualizing the matches as a clustered heat map
(see figure below). Two transactions match when they occur at the same
establishment and within a (user-configurable) time span of each other.
Transactions can also be filtered by daily time span (e.g. narrowing
transactions down to evenings) and by global time span (e.g. narrowing
transactions down to the last days of the period).
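The matching rule described above can be sketched as follows. This is a minimal illustration, not the tool used for this submission: the function name `match_counts`, the tuple layout `(employee, location, timestamp)`, the 30-minute default window, and the sample records are all assumptions made for the example. Only the daily time-span filter is shown; a global date filter would work the same way.

```python
from collections import defaultdict
from datetime import datetime, time, timedelta
from itertools import combinations


def match_counts(transactions, window=timedelta(minutes=30),
                 day_start=time(0, 0), day_end=time(23, 59, 59)):
    """Count matching transactions for every pair of employees.

    transactions: iterable of (employee, location, timestamp) tuples
    (hypothetical layout). Two transactions match when they are made by
    different employees at the same location and at most `window` apart.
    Only transactions whose time of day lies within [day_start, day_end]
    are considered (the daily time-span filter).
    """
    by_location = defaultdict(list)
    for employee, location, ts in transactions:
        if day_start <= ts.time() <= day_end:
            by_location[location].append((ts, employee))

    counts = defaultdict(int)
    for entries in by_location.values():
        entries.sort()  # chronological, so ts_b >= ts_a below
        for (ts_a, emp_a), (ts_b, emp_b) in combinations(entries, 2):
            if emp_a != emp_b and ts_b - ts_a <= window:
                counts[frozenset((emp_a, emp_b))] += 1
    return dict(counts)


# Made-up sample records, for illustration only.
sample = [
    ("A", "Cafe X", datetime(2014, 1, 6, 12, 0)),
    ("B", "Cafe X", datetime(2014, 1, 6, 12, 10)),
    ("C", "Cafe X", datetime(2014, 1, 6, 19, 0)),
    ("A", "Diner Y", datetime(2014, 1, 6, 19, 5)),
    ("C", "Diner Y", datetime(2014, 1, 6, 19, 20)),
]
result = match_counts(sample)
evening = match_counts(sample, day_start=time(17, 0))
```

The resulting pair counts are the cell values of an employee-by-employee matrix; ordering its rows and columns with a clustering method would give a clustered heat map like the one referred to above.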
Credit card transaction patterns:
What: Lone truck drivers
Who: All truck drivers
Where: Mostly airports and manufacturers
When: Entire time period
Significance: Little, but it does show the deviation (and possible isolation)
of the truck drivers from typical employees
Confidence: High, this pattern applies to all truck drivers
What: Engineers stick together and have a tight coffee schedule
Who: Most of the engineering department, but F. Balas, G. Cazar, V. Frente,
A. Calzas, and L. Azada in particular
Where: Mostly the 'Bean there done that' establishment
When: 12:00 sharp
Significance: Little
Confidence: High, consistent across most engineers
What: The security department sticks together, but its group also includes
other employees
Who: The security department and D. Coginion, L. Lagos, R. Mies Haber,
S. Flecha, C. Lais, M. Bramar, and B. Tempestad
Where: Mostly Guy's Giros, Brew've been served, and Hippokampos
When: Around breakfast and dinner time
Significance: Medium, provides connections between the already suspect
security department and other GAStech employees. The connection to
R. Mies Haber is of particular interest, because she is a likely POK relative.
Confidence: High, many consistent transactions
What: Two employees (male and female, same age) visit a hotel in the afternoon
Who: B. Tempestad and I. Borrasca
Where: Chostus Hotel
When: January 10th, 14th, and 17th, around 13:30
Significance: Little
Confidence: High, there are three separate visits and transaction costs of
around 100
Spatiotemporal patterns:
What: Security employees appear to be living close together.
Who: I. Ferro, L. Bodrogi, I. Vann, H. Osvaldo
Where: South-east Abila
When: N/A
Significance: High, there must be close contact between them.
Confidence: High, derived from track data.
What: Security employees consistently visit five different places (A-E), in
varying group compositions, that no other GAStech employee visits.
Who: I. Ferro, L. Bodrogi, I. Vann, H. Osvaldo, M. Mies
Where: see Figure below
When: January 7, 8, 9, 10, 11, 14, 15, 17, 18, consistently between 12:30
and 13:30
Significance: High, what do they do there?
Confidence: High, cross-referenced with credit card data
What: Security employees are the only employees who visit the houses of the
executives
Who: I. Ferro, L. Bodrogi, I. Vann, H. Osvaldo, M. Mies
Where: see Figure above
Significance: High, why do they visit them?
Confidence: High, no other GAStech employee pays a visit.
What: GPS signal noise
Who: E. Orilla
Where: Everywhere
When: Always
Significance: Little
Confidence: High
What: GPS signal noise
Who: A. Calzas
Where: Mostly in north Abila
When: Always
Significance: Little
Confidence: High
What: Executives play golf on Sunday
Who: Sanjorge, Vasco-Pais, Barranco, Strum, Campo-Corrente
Where: Golf course (North Abila)
When: Sundays
Significance: Little; however, no one else plays golf.
Confidence: High.
What: Missing data and no known residence for Sanjorge (always stays at a
hotel)
Who: S. Sanjorge Jr.
Where: Chostus Hotel
When: Always
Significance: Little
Confidence: High
MC2.3 – Like most datasets, the data you were provided is imperfect, with
possible issues such as missing data, conflicting data, data of varying
resolutions, outliers, or other kinds of confusing data. Considering MC2 data
is primarily spatiotemporal, describe how you identified and addressed the
uncertainties and conflicts inherent in this data to reach your conclusions in
questions MC2.1 and MC2.2.
Please limit your response to no more than five images and 300 words.
To answer the two questions of this mini challenge, we had to take
uncertainties and conflicts in the spatiotemporal data into account. Below is
an analysis:
From this analysis we know that a given interpretation of the data may render
certain observations invalid. We addressed these potential issues as follows.
We started by visualizing all track data and looking for anomalies, e.g.
whether tracks (or parts of tracks) are missing on one or more days. Where
that was the case, we investigated those occurrences further with the credit
card data (see Figures below for an example). Credit card data could also be
used to check for clocks that are out of sync, by comparing the time of each
transaction with the time interval during which the car was parked.
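The comparison of transaction times with parked intervals can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually implemented for this submission: the function names, the tuple layouts, the five-minute tolerance, and the sample records are hypothetical, and parked intervals are assumed to have been extracted from the GPS tracks beforehand.

```python
from datetime import datetime, timedelta


def offset_outside_interval(ts, start, end):
    """Return how far `ts` lies outside [start, end]; zero if inside."""
    if ts < start:
        return ts - start  # negative: transaction before the car arrived
    if ts > end:
        return ts - end    # positive: transaction after the car left
    return timedelta(0)


def flag_out_of_sync(transactions, parked, tolerance=timedelta(minutes=5)):
    """Flag transactions that no parked interval at the same location explains.

    transactions: list of (location, timestamp) for one employee.
    parked: list of (location, start, end) for the same employee's car.
    Returns (location, timestamp, offset) tuples; offset is None when the
    car was never parked at that location.
    """
    flagged = []
    for location, ts in transactions:
        stops = [(s, e) for loc, s, e in parked if loc == location]
        if not stops:
            flagged.append((location, ts, None))
            continue
        # Smallest distance (in absolute terms) to any stop at this location.
        best = min((offset_outside_interval(ts, s, e) for s, e in stops),
                   key=abs)
        if abs(best) > tolerance:
            flagged.append((location, ts, best))
    return flagged


# Made-up sample records, for illustration only.
day = datetime(2014, 1, 6)
transactions = [
    ("Cafe X", day.replace(hour=12, minute=20)),
    ("Diner Y", day.replace(hour=19, minute=50)),
    ("Hotel Z", day.replace(hour=13, minute=30)),
]
parked = [
    ("Cafe X", day.replace(hour=12, minute=0), day.replace(hour=12, minute=45)),
    ("Diner Y", day.replace(hour=18, minute=30), day.replace(hour=19, minute=30)),
]
flagged = flag_out_of_sync(transactions, parked)
```

Under these assumptions, an offset of similar size and sign across many transactions would point to a clock that is out of sync, whereas an isolated flag or a `None` offset would more likely point to a missing part of a track.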